16 research outputs found

    Final lengthening - a consequence of articulatory and perceptual restrictions?

    No abstract

    Estimating Performance of Pipelined Spoken Language Translation Systems

    Most spoken language translation systems developed to date rely on a pipelined architecture, in which the main stages are speech recognition, linguistic analysis, transfer, generation and speech synthesis. When making projections of error rates for systems of this kind, it is natural to assume that the error rates for the individual components are independent, making the system accuracy the product of the component accuracies. The paper reports experiments carried out using the SRI-SICS-Telia Research Spoken Language Translator and a 1000-utterance sample of unseen data. The results suggest that the naive performance model leads to serious overestimates of system error rates, since there are in fact strong dependencies between the components. Predicting the system error rate on the independence assumption by simple multiplication resulted in a 16% proportional overestimate for all utterances, and a 19% overestimate when only utterances of length 1-10 words were considered. Comment: 10 pages, LaTeX source. To appear in Proc. ICSLP '9
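The naive performance model described in this abstract can be illustrated with a minimal sketch. It assumes that "proportional overestimate" means (predicted error - observed error) / observed error, which is one reading of the abstract and not something it confirms. The function names and all numeric values are hypothetical and are not figures from the paper.

```python
from math import prod


def predicted_system_error(component_accuracies):
    """Error rate predicted by the naive model: 1 - product of accuracies."""
    return 1.0 - prod(component_accuracies)


def proportional_overestimate(predicted_error, observed_error):
    """Relative amount by which the prediction exceeds the observed error rate."""
    if observed_error <= 0:
        raise ValueError("observed_error must be positive")
    return (predicted_error - observed_error) / observed_error


# Hypothetical per-stage accuracies (recognition, analysis, transfer,
# generation, synthesis); these are NOT figures from the paper.
stage_accuracies = [0.90, 0.95, 0.95, 0.98, 0.99]
# Hypothetical end-to-end error rate measured on held-out utterances.
observed_error = 0.18

predicted_error = predicted_system_error(stage_accuracies)
overestimate = proportional_overestimate(predicted_error, observed_error)
print(f"predicted error rate: {predicted_error:.3f}")
print(f"proportional overestimate: {overestimate:.1%}")
```

If component errors tend to fall on the same utterances, the measured end-to-end error rate is lower than the product-based prediction, so the overestimate is positive, which matches the direction of the effect the abstract reports.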

    Inclusion of a prosodic module in spoken language translation

    Current speech recognition systems work mainly on a statistical basis and make no use of information signalled by prosody, i.e. the segment duration and fundamental frequency contour of the speech signal. In more advanced applications of speech recognition, such as speech-to-speech translation systems, it is necessary to include the linguistic information conveyed by prosody. Earlier research has shown that prosody conveys information at syntactic, semantic and pragmatic levels. The degree of linguistic information conveyed by prosody varies between languages, from languages such as English, with a relatively low degree of prosodic disambiguation, via tone-accent languages such as Swedish, to pure tone languages. The inclusion of a prosodic module in speech translation systems is not only vital in order to link the source language to the target language, but could also be used to enhance speech recognition proper. Besides syntactic and semantic information, properties such as dialect, sociolect, sex and attitude are signalled by prosody. Speech-to-speech recognition systems that do not transfer this type of information will be of limited value for person-to-person communication. A tentative architecture for the inclusion of a prosodic module in a speech-to-speech translation system is presented.

    Inclusion of a prosodic module in spoken language translation systems
